-
Notifications
You must be signed in to change notification settings - Fork 4.1k
[STORM-2201] Add dynamic scheduler configuration loading #2199
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Hi @HeartSaVioR @revans2, could you please review this when you get a chance? Thanks very much! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Ethanlm Sorry to visit too late. This is a bit huge and I often lost my concentration.
Left some comments but looks good overall. The blocker thing is that I have no idea what artifactory is.
Is it internal proprietary? If then it should be better to generalize (or create another one) ArtifactoryConfigLoader to accept generic http server implementation. OK to not support artifact's' feature in http server implementation.
@revans2 What do you think?
|
|
||
| @SuppressWarnings("rawtypes") | ||
| private Map _conf; | ||
| private int _artifactoryPollTimeSecs = 600; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor: Maybe better to have a constant (private static final) for 600.
| private boolean _cacheInitialized = false; | ||
| // Location of the file in the artifactory archive. Also used to name file in cache. | ||
| private String _localCacheDir; | ||
| private String _artifactoryScheme = "http"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor: Maybe better to have a constant (private static final) for http.
| // Location of the file in the artifactory archive. Also used to name file in cache. | ||
| private String _localCacheDir; | ||
| private String _artifactoryScheme = "http"; | ||
| private String _baseDirectory = "/artifactory"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor: Maybe better to have a constant (private static final) for /artifactory.
| private String _artifactoryScheme = "http"; | ||
| private String _baseDirectory = "/artifactory"; | ||
| private int _lastReturnedTime = 0; | ||
| private int _timeoutSeconds = 10; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor: Maybe better to have a constant (private static final) for 10.
| HttpEntity entity = response.getEntity(); | ||
| return entity != null ? EntityUtils.toString(entity) : null; | ||
| } else { | ||
| LOG.error("Got unexpected response code {}", status); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe better to include the content of entity to error log as well if the response has an entity even though it was not successful.
| tmpDirPath = Files.createTempDirectory("TestArtifactoryConfigLoader"); | ||
| File f = tmpDirPath.toFile(); | ||
| f.mkdir(); | ||
| File dir=new File(f, "nimbus"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
both spaces around =
| dir.mkdir(); | ||
| } | ||
|
|
||
| private void recursiveDeleteFile(File dir) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we could just delete it with FileUtils.deleteDirectory() if we're relying on Apache commons-io now. If not it's OK to keep this as it is.
| } | ||
|
|
||
| @Test | ||
| public void testInvalidConfig() throws Exception { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like redundant with testInvalid().
| fw.close(); | ||
|
|
||
| Config config = new Config(); | ||
| config.put(ArtifactoryConfigLoader.ARTIFACTORY_URI, "file://"+temp.getCanonicalPath()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
both spaces around +
|
|
||
| Assert.assertNotNull("Unexpectedly returned null", result2); | ||
|
|
||
| // Shouldn't change yet |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove comment.
|
@HeartSaVioR Thanks for the review. I will address your comments ASAP but it will take me a little longer to catch up this PR |
|
@HeartSaVioR Thanks for the comments! I rebased the branch and addressed some of your comments. I am still working on the rest of them |
|
I refactored the code. It simplifies configurations a lot.
Doing some more testing, especially for |
|
@Ethanlm Please mention with cc. to proper folk(s) when you think you couldn't answer this question. |
|
@HeartSaVioR You may want to check with this website. My understanding is that we can get most recent file from artifactory server. It's not the same with normal http server. |
|
@Ethanlm Thanks for the information. Looks like we can't expect users to use artifactory. If possible we'd be better to have HttpConfigLoader implementation, but may be not easy to implement same for artifactory like pulling most recent file. We can start with just getting full URL for the config file, and pull it periodically. |
|
@revans2 Do you have any suggestions? Thanks. |
revans2
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall it looks really good. Just 2 comments about how I think we can make it more maintainable/extensible in the future.
| String scheme = uri.getScheme(); | ||
| switch (SchemeType.toSchemeType(scheme)) { | ||
| case FILE: | ||
| return new FileConfigLoader(loaderParams); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To me it feels like this would be a great place to use a ServiceLoader instead of having a hard coded FileConfigLoader or ArtifactoryHttpConfigLoader.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have been thinking about using a ServiceLoader and tried so. But I realized that we actually want to have only one type of loader working in the mean time (one loader for one type of scheme). So I think switch-case here seems cleaner.
The way I can think of applying ServiceLoader here is to have every type of loaders check the scheme in their load() function and then either directly return null (because they don't deal with this type of scheme) or return the result. Something like:
For HttpConfigLoader
Map load(loaderParams) {
scheme = getScheme();
if (!scheme.equalsIgnoreCase("http")) {
return null;
}
//else: process and return result;
}
For ArtifactoryHttpConfigLoader,
Map load(loaderParams) {
scheme = getScheme();
if (!scheme.equalsIgnoreCase("artifactory+http")) {
return null;
}
//else: process and return result;
}
Is this what you are talking about? Sorry if I mis-understood it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I was thinking we might want to use a factory pattern so we can cache connections and other things in the IConfigLoader implementation. Something perhaps like.
public interface IConfigLoaderFactory {
IConfigLoader createIfSupported(URI uri, Map<String, Object> conf);
private static ServiceLoader<IConfigLoaderFactory> serviceLoader = ServiceLoader.load(IConfigLoaderFactory.class);
public static IConfigLoader open(URI uri, Map<String, Object> conf) {
for (IConfigLoaderFactory factory: serviceLoader) {
IConfigLoaderFactory ret = factory.createIfSupported(uri, conf);
if (ret != null) {
return ret;
}
}
throw some exception();
}
}
public class HttpArtifactoryServiceLoaderFactory implements IConfigLoaderFactory{
public IConfigLoader createIfSupported(URI uri, Map<String, Object> conf);
if (!"artifactory+http".equals(uri.getScheme())) {
return null;
}
return new ...;
}
}There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems a great idea to me. I will try to do so. Thanks
| * A dynamic loader that can load scheduler configurations for user resource guarantees from Artifactory (an artifact repository manager). | ||
| */ | ||
| public class ArtifactoryHttpConfigLoader implements IConfigLoader { | ||
| protected static final String ARTIFACTORY_BASE_DIRECTORY = "artifactory.config.loader.base.directory"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we have configs in here can we tag them with the config validation annotations?
@isSting
And also register it with the config service loader?
import org.apache.storm.validation.Validated;
...
ArtifactoryHttpConfigLoader implements IConfigLoader, Validated {
and then append org.apache.storm.scheduler.utils.ArtifactoryHttpConfigLoader to ./storm-server/src/main/resources/META-INF/services/org.apache.storm.validation.Validated
|
|
||
| String SCHEDULER_CONFIG_LOADER_URI = "scheduler.config.loader.uri"; | ||
| String SCHEDULER_CONFIG_LOADER_POLLTIME_SECS = "scheduler.config.loader.polltime.secs"; | ||
| String SCHEDULER_CONFIG_LOADER_TIMEOUT_SECS = "scheduler.config.loader.timeout.secs"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here too can we annotate these with config validation annotations, and make IConfigLoader part of the Config Validation service loader?
|
@Ethanlm |
|
@HeartSaVioR Yes, please. But this patch is not ready yet. I am going to do some tests manually on ArtifactoryHttpConfigLoader and also add some comments to the source code. Thanks. (I tested FileConfigLoader manually) |
|
@Ethanlm Sure, please leave a comment when you are done. I'll take a look at further changes too. |
| this.conf = conf; | ||
| schedulingPrioritystrategy = (ISchedulingPriorityStrategy) ReflectionUtils.newInstance( | ||
| (String) conf.get(DaemonConfig.RESOURCE_AWARE_SCHEDULER_PRIORITY_STRATEGY)); | ||
| (String) conf.get(DaemonConfig.RESOURCE_AWARE_SCHEDULER_PRIORITY_STRATEGY)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there a reason why we need to change the spacing in many places in this file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No.. It just accidentally happened..
|
I am done with testing and I think this patch is ready for review. Now I have @HeartSaVioR @revans2 @jerrypeng Could you please let me know if you have any suggestions on this patch when you get a chance. Thanks very much! I want to explain a bit about To use We can query the REST API We have It the URI points to a file, the metadata will be like this: where we can get the "downloadUri" and then download that file. |
revans2
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is looking good.
| <dependency> | ||
| <groupId>com.google.auto.service</groupId> | ||
| <artifactId>auto-service</artifactId> | ||
| </dependency> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: reading through the documentation for this it indicates that we should mark this as <optional>true</optional>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressing it.
|
|
||
| /** | ||
| * Configuration elements for scheduler config loader. | ||
| */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I made a mistake. I thought initially that each of the configs in ConfigLoaderConfiguration were regular configs under storm.yaml. I should have read the code more closely, because they actually are sub-configs under "scheduler.config.loader.params".
I don't really like them being sub configs because they don't get validated. Which was the entire point of me asking you to add them into the Validated ServiceLoader. I also don't see a lot of value in having
scheduler.config.loader.params:
scheduler.config.loader.uri: "artifactory+http://artifactory.my.company.com:8000/artifactory/configurations/clusters/my_cluster/ras_pools"
scheduler.config.loader.timeout.sec: 30over
scheduler.config.loader.uri: "artifactory+http://artifactory.my.company.com:8000/artifactory/configurations/clusters/my_cluster/ras_pools"
scheduler.config.loader.timeout.sec: 30Could we change it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right. I also felt that scheduler.config.loader.params makes things unnecessarily complicated. Changing it
| targetURI = new URI(uriString); | ||
| scheme = targetURI.getScheme().substring(ARTIFACTORY_SCHEME_PREFIX.length()); | ||
| } catch(URISyntaxException e) { | ||
| LOG.error("Failed to parse uri={}", uriString); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we want to throw an exception here. I don't think we can or should recover from a bad URI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My movitation here is that if the ConfigLoader doesn't work correctly, we will just return null. Then scheduler will fall back to use "multitenant-scheduler.xml" file, or the configuration from storm.yaml. But I'm not sure if it's good
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I personally would prefer to see it fail fast here and not come up if the URI is not formatted correctly.
| if (downloadURI != null) { | ||
| // Then get it and return the file as string. | ||
| String returnValue = doGet(null, location, host, port); | ||
| saveInArtifactoryCache(returnValue); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
saveInArtifactoryCache checks for null. I know it is not as readable here, but it does look like it works.
|
Tested it again. Should be good now. |
revans2
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am +1 on this change. I would like to see us fail fast on the config issue, but even without it I am fine with the code as is.
|
@HeartSaVioR you took a look at this patch before, do you want to take another look at it? @Ethanlm could you squash your commits into just one? |
|
@revans2 still +1. |


This adds an interface and two implementations, one that will load from a local file, and another
that will load a config from artifactory.
It also modifies the ResourceAwareScheduler and Multitenant schedulers to have a plugin interface and configuration entries for the plugin.
Unit tests are also added for the implementations.
See: #1785
Also fixed some checkstyle issues.